A SCHEMA-BASED MODEL OF INFORMATION PROCESSING
FOR SITUATION ASSESSMENT


The  research  reported  here  tests  a  model  of  human  information  processing 
that  links  the  properties  of  presented  information  to  the  situation  assessments 
based  on  this  information  (Noble,  1985).  The  model  focuses  on  the  kind  of 
information  processing  required  to  support  judgmental  processes  that  are  based 
primarily  on  experience  and  situation  recognition.  These  include  processes  that 
may underlie "intuitive" decision making. This model is based primarily on
schema* theory, though it also draws on other notions from cognitive psychology.

Schema  are  memory  structures  used  for  information  processing  that  enable 
people  to  use  their  experience  to  recognize  and  interpret  situations,  understand 
language  and  stories,  make  decisions,  and  solve  problems.  A  well-known  type  of 
schema is the script (Bower, Black and Turner, 1979; Pryor, 1985). Scripts are
time-event  models  of  familiar  experiences.  The  events,  which  partition  the 
script  into  major  scenes,  may  be  scripts  themselves.  Schema,  in  general,  need 
not  be  time  or  event  oriented.  They  are  characterized  by  variables,  a  hierarchy 
of  embedding,  and  varying  levels  of  abstraction  which  "attempt  to  represent 
knowledge  in  the  kind  of  flexible  way  which  reflects  human  tolerance  for 
vagueness, imprecision, and quasi-inconsistencies" (Rumelhart, 1977). As
recognition devices, their "processing is aimed at the evaluation of their
goodness of fit to the data being processed" (Rumelhart, 1980).


*  We  use  schema,  rather  than  schemata,  as  the  plural. 


Several researchers have shown that people use schema to understand
natural language and stories (van Dijk and Kintsch, 1983; Rumelhart, 1981;
Thorndyke, 1977). Thorndyke showed that there exist schema that define a
standard structure for stories. Stories with this structure were easier to
understand than those with a different structure. Van Dijk and Kintsch's model
of language comprehension explains how multiple schema interacting within a
linked hierarchy enable people to understand words, sentences, and paragraphs.
Rumelhart showed that stories seem to be understood in terms of explanatory
schema evoked by key words.

Schema have also been shown to guide actual judgment, behavior, and
decision making in cognitive, social, and clinical psychology (Abelson, 1981). They
have  been  shown  to  be  useful  in  several  cases  where  problem  solving  is  based  on 
the  ability  to  use  methods  that  worked  previously  in  similar  situations.  These 
cases  have  included  understanding  and  solving  arithmetic  word  problems  (Kintsch, 
1985),  solving  algebra  problems  based  on  their  propositional  structure  (Mayer, 
1982),  and  finding  promising  solution  strategies  for  geometry  and  maze  problems 
based  on  their  surface  features  (Lewis,  1985).  They  have  also  been  shown  to 
account  for  differences  between  expert  and  novice  approaches  to  solving  physics 
problems  (Larkin,  1983),  and  to  account  for  some  flawed  heuristics  and  biases 
associated  with  human  information  processing  (Tversky,  1980;  Kahneman,  1973; 
Tversky,  1983).  In  their  papers,  Tversky  and  Kahneman  showed  that  people  seem 
to  try  to  establish  schema  that  can  account  for  observed  data,  and  then  to  use 
schema for reasoning. A classic paper on chess expertise (Chase and Simon,
1973), though not explicitly a "schema" paper, also shows that expertise can be
based  upon  the  ability  to  recognize  chess  patterns  associated  with  previously 
learned  good  moves. 


The knowledge contained in schema can be applied to a particular problem
only if that problem's relevance to a schema can be recognized. Since schema
enable particular instances to be recognized as belonging to a class of
instances, schema can be regarded as partly classification devices (Abelson,
1981). The present research adopts a probabilistic view of classification
(Smith, 1981), in which an object or concept is classified when enough of its
weighted features match the set of features associated with the category. The
features can be at several levels of abstraction (Tversky, 1984; Larkin, 1983).
They may be physical parts, properties such as symmetry or color, or functions
(Gati, 1984). It is not presently understood how people are able to recognize the
features  used  in  classification.  One  possible  mechanism  could  be  based  on  a 
hierarchy  of  similarity  assessments  (Tversky,  1977)  at  different  levels  of 
aggregation  and  abstraction  (Rumelhart,  1980). 
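
The probabilistic view of classification described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the example category, its feature weights, and the 0.6 threshold are all hypothetical and do not come from the studies cited.

```python
def classify(observed, category_features, threshold=0.6):
    """Return True when enough weighted features match the category.

    observed: set of feature names present in the instance.
    category_features: dict mapping feature name -> weight (hypothetical values).
    threshold: assumed fraction of the total weight that must be matched.
    """
    total = sum(category_features.values())
    matched = sum(w for f, w in category_features.items() if f in observed)
    return matched / total >= threshold


# Hypothetical category and weights, for illustration only.
bird = {"has_wings": 3.0, "has_feathers": 3.0, "flies": 2.0, "sings": 1.0}

robin_like = classify({"has_wings", "has_feathers", "flies"}, bird)
bat_like = classify({"has_wings", "flies"}, bird)
print(robin_like, bat_like)
```

An instance need not match every feature; it is accepted once the matched weight is a large enough share of the total, which is what allows inexact matches to be classified.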

Schema  can  support  reasoning  by  analogy  by  helping  people  to  recognize 
that  a  particular  situation  is  related  to  a  class  of  situations.  In  reasoning 
by  analogy,  methods  proven  to  work  for  one  class  of  problems  are  applied  to  a 
new  class  (diSessa,  1983). 

It is not yet understood how schema are formed (Rumelhart, 1977), but
since schema represent the results of experience, schema must somehow be
generalized from a sequence of past experiences. One model (Hayes-Roth, 1977;
Elio, 1981), based on feature powersets of exemplars, proposes that each time a
new experience is encountered which is similar to one for which a schema
exists, the elements of the property set of the new experience augment in memory
those property sets which have been previously stored. Our research is also
based on the assumption that schema are developed from past experiences.
Accordingly, subject training, which is designed to help people quickly acquire
schema, is based on presentation of examples.

The  research  described  here  builds  upon  many  previous  concepts,  but  is 
most  directly  related  to  the  work  of  Zimmerman  and  Zysno  (1980)  which  applies 
fuzzy  set  concepts  to  decision  making.  In  that  study,  subjects  rated  the 
quality  of  features  (fit  and  strength,  as  inferred  from  shape  and  color)  of  each 
tile  to  be  used  in  a  furnace,  and  separately  rated  the  overall  quality  of  that 
tile. Zimmerman and Zysno noted that the assessments of overall tile quality
could  be  estimated  from  the  geometric  mean  of  the  subjects'  estimates  of  tile 
color  and  shape. 

Schema-based  information  processing  is  described  in  the  literature  for  a 
broad  range  of  tasks.  Some  of  the  researchers  describe  general  principles  of 
schema  structure  and  operation  that  are  applicable  to  many  problems.  Others 
propose a specific structure and information processing sequence within the
context of a particular problem. This paper is of the latter type. It describes a
specific  information  processing  model  for  situation  assessment. 

This  model  assumes  that  situation  assessment  occurs  by  comparing  an 
observed  situation  with  memory  reference  models  for  different  situation  types 
and by associating the observed situation with the reference model that it
matches  best.  This  model  "represents  knowledge  in  the  kind  of  flexible  way 
which  reflects  human  tolerance  for  ....  imprecision"  (Rumelhart,  1977),  thereby 
enabling  it  to  accommodate  inexact  matches  between  observed  and  reference 
situations.  While  the  model  draws  on  the  current  literature  and  is  consistent 
with  the  data  in  this  literature,  the  specific  model  proposed  here  is  believed 


The  information  processing  model 

The human information processing model presented here comprises an
information processing structure and information processing steps that relate
the properties of presented information about a situation to the assessments
made about that situation.

In  the  experiments  to  be  described,  the  situations  to  be  evaluated  are 
"all-out  attacks"  or  "barriers".  The  information  processing  model  describes  the 
specific  steps  through  which  presented  information  results  in  an  assessment  of 
attack  or  barrier  quality.  An  example  of  the  presented  information  is  shown  in 
Figure  1.  This  particular  picture  was  one  of  several  used  for  training.  The 
test  pictures  are  similar,  but  do  not  include  text  information  about  attack 
quality.  In  this  figure  the  friendly  forces  (white)  are  positioned  in  the 
center of the picture. They are surrounded by hostile (black) ships,
submarines, and aircraft. The information processing model accounts for subjects'
assessments  of  the  effectiveness  of  this  attack  in  terms  of  the  attack  features. 

Information  processing  structure 

The  information  processing  structure  consists  of  a  network  of  linked 
schema.  The  hypothesized  structure  for  evaluating  the  quality  of  an  all-out 
attack consists of four primary schema: one each for the surface, air, and
submarine threats, and one for the overall attack.

Each  schema  (Figure  2)  consists  of  three  layers:  a  slot  layer,  a 
criteria  layer,  and  an  inference  and  action  layer.  Each  schema  can  be  thought 
of  as  a  decision  making  mechanism,  with  each  layer  corresponding  to  a  step  in 
the  decision  process:  the  slot  layer  corresponds  to  problem  formulation;  the 
criteria layer to problem analysis; and the inference and action layer to
alternative selection.


Figure  1.  An  example  of  a  training  picture  for  all-out  attacks  in  experiment  1. 

Subjects  were  told  that  "Attack  effectiveness  is  4.  The  air  threat 
is  severe,  but  the  ship  and  sub  threats  are  weak.  There  are  too  few 
ships,  and  the  submarines  are  concentrated  in  only  a  single  quadrant. 

The slot layer specifies a set of slots used for identifying situation
features  that  are  relevant  to  the  schema.  Each  slot  specifies  the  physical  and 
functional  properties  of  potential  slot  fillers.  The  schema  for  the  submarine 
threat  contains  a  slot  for  the  feature  "many  submarines"  and  a  second  slot  for 
the  feature  "multi-axis  threat".  The  schema  for  the  overall  attack  contains 
slots  for  the  overall  surface,  subsurface,  and  air  threats. 

The second layer contains data for feature assessment. Our model
presents these data as feature criteria curves and weighting rules. The criteria
curves  convert  measurable  picture  properties,  such  as  the  number  of  aircraft  or 
the  distance  between  ships,  into  subjective  feature  assessments  such  as  "many 
aircraft"  or  "barrier  length".  In  our  experiments  the  feature  assessments 
measure  the  degree  to  which  features  have  characteristics  consistent  with  a  high 
quality  hostile  attack  or  barrier.  In  the  schema  for  the  surface  threat,  for 
example,  the  criteria  curve  for  the  feature  "many  ships"  defines  the  extent  to 
which any particular number of ships qualifies as being "many ships" in the
specific context of an all-out attack. These criteria curves may be interpreted as
fuzzy set membership functions in the set "many ships". In these experiments,
subjective feature assessment scores range from one to ten, with a score of ten
indicating a feature characteristic of a very strong attack or barrier and a score
of one indicating a feature characteristic of a very weak attack or barrier.

For  the  attacks  in  our  first  experiment,  a  picture  with  only  a  single  ship  would 
score  about  a  one  on  the  feature  "many  ships";  one  with  seven  ships  would  score 
about  ten  on  this  feature. 
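
A criteria curve of this kind can be sketched as a simple function from a physical count to a score on the one-to-ten scale. The two anchor points (one ship scores about one, seven ships score about ten) come from the text; the straight-line shape between them and the function name are assumptions made only for illustration, since the shape of the actual curve is not given here.

```python
def many_ships_score(n_ships, low=(1, 1.0), high=(7, 10.0)):
    """Hypothetical criteria curve for the feature "many ships".

    Maps a ship count to a 1-10 feature assessment score. The anchors
    follow the text; the linear interpolation between them is an assumption.
    """
    (x0, y0), (x1, y1) = low, high
    if n_ships <= x0:
        return y0
    if n_ships >= x1:
        return y1
    return y0 + (y1 - y0) * (n_ships - x0) / (x1 - x0)


for n in (1, 4, 7, 12):
    print(n, many_ships_score(n))
```

Read as a fuzzy set membership function, the same curve could be rescaled to the interval from zero to one; the one-to-ten range is kept here because it is the scale used in the experiments.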

The  second  layer  also  contains  a  rule  for  assigning  feature  weights. 
These  weights  reflect  the  relative  importance  of  each  feature  in  assessing 
overall  attack  or  barrier  effectiveness.  Examples  of  weighting  rules  could 
include "assign equal weights", "assign each feature a weight equal to its
feature  assessment  score",  or  for  a  two  feature  schema  "weight  the  higher 
feature  .25  and  the  lower  feature  .75". 
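
The three example weighting rules can be written as small functions that take the feature assessment scores and return one weight per feature. The function names are hypothetical, and normalizing the weights so that they sum to one is an assumption added for convenience; the rules themselves are the ones quoted above.

```python
def equal_weights(scores):
    """Rule: assign equal weights (normalized to sum to one)."""
    return [1.0 / len(scores)] * len(scores)


def score_proportional_weights(scores):
    """Rule: weight each feature by its own assessment score.

    Dividing by the total is an assumed normalization step.
    """
    total = sum(scores)
    return [s / total for s in scores]


def higher_lower_weights(scores, w_high=0.25, w_low=0.75):
    """Rule for a two-feature schema: higher feature .25, lower feature .75."""
    first, second = scores
    if first >= second:
        return [w_high, w_low]
    return [w_low, w_high]


print(equal_weights([8, 2]))
print(score_proportional_weights([8, 2]))
print(higher_lower_weights([8, 2]))
```

Because each rule is a function of the scores, the weights can differ from picture to picture even though the rule itself is fixed for the schema.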

The third schema layer specifies the actions to be taken and inferences
to be made given various levels of schema activation. Such actions and
inferences are retrieved as if by a table look-up within the schema. Examples of
actions to be taken are "attempt to pass through barriers of quality less than 5
(as rated by the schema)" and "do not pass through barriers rated of higher
quality". This layer plays no role in the present situation assessment model, and
is not examined in these experiments. The inference and action layer is expected
to be important in models that address decision making based on situation
assessment.
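
The table look-up can be pictured with a minimal sketch. The table layout and names below are hypothetical; the two actions are the examples given in the text, and treating a rating of exactly 5 as "do not pass through" is an assumption, since the text does not say how the boundary is handled.

```python
# Hypothetical action table: (exclusive upper bound on rated quality, action).
ACTION_TABLE = [
    (5, "attempt to pass through"),
    (float("inf"), "do not pass through"),
]


def look_up_action(barrier_quality):
    """Return the first action whose quality bound exceeds the schema rating."""
    for upper_bound, action in ACTION_TABLE:
        if barrier_quality < upper_bound:
            return action
    return ACTION_TABLE[-1][1]


print(look_up_action(3))
print(look_up_action(8))
```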

Schema acquisition

People are assumed to acquire schema by abstracting (usually
subconsciously) a general model from specific instances. In acquiring the schema
described  previously,  people  must  identify  1)  a  set  of  situation  features 
corresponding  to  the  schema  slots,  2)  a  set  of  feature  criteria  curves,  and  3)  a 
feature  weight  assignment  rule.  In  the  present  experiments,  subjects  acquired 
schema  by  being  shown  a  set  of  examples,  each  associated  with  an  attack  or 
barrier  effectiveness  score  and  a  qualitative  statement  about  the  strength  of 
individual  features.  Subjects  were  not  told  the  feature  criteria  curves  or 
feature  weighting  rules,  but  were  expected  to  infer  these  as  they  acquired  the 
schema. 


Figure 3 summarizes the overall schema acquisition environment in the
experiments. The experimenters developed a "schema-like" situation assessment
model containing feature assessment curves and a weighting rule for a specified
set of features. Subjects were then trained by being shown examples constructed
from this model. It is proposed that the subjects develop the schema by
abstracting the relevant features, feature criteria curves, and feature
weighting rule from these examples.

Figure 3. Material Preparation, Training, and Proposed Schema Acquisition:
All-out Attack and Barrier Experiments

Information  processing  steps 

Figure  4  outlines  seven  information  processing  steps  proposed  to 
account  for  subjects'  ratings  of  situation  quality.  These  steps  process 
information  derived  from  the  presented  pictures  by  using  reference  data  that  is 
developed  during  training  and  stored  within  the  network  of  schema. 


1.  Initial  selection  of  schema.  Presentation  of  a  task  will  cause  selection  of 
schema  related  to  the  task.  Thus,  the  task  "evaluate  the  following  all-out 
attacks"  will  cause  schema  related  to  attack  evaluation  to  be  made  available. 
This  step  is  not  examined  in  these  experiments,  and  is  not  discussed  further. 

2.  Object  classification.  Subjects  classify  the  familiar  objects  within  each 
picture  of  an  attack  or  barrier.  It  is  at  this  point  that  ships  are  recognized 
as ships and not as blotches caused by a dirty copying machine. The present
experiments do not examine how objects are classified, and this step is not
discussed further.

3. Assessment of feature relevance and functional substitution. This step
examines the objects and relationships among objects classified in the previous
step, attempts to find relevant features, and converts relevant features into
standard physical units. This step uses information in the slot layer of the
schema which specifies functional and physical properties of objects relevant to
the schema. In this step all objects able to fill a schema slot are converted
into the standard physical units used by the schema criteria curves for feature
assessment. In the experiment 3 barrier evaluations, islands are converted into
ship equivalents in this step.

Figure 4. Use of Schema for Situation Assessment

4.  Feature  assessment.  In  this  step,  the  physical  units  filling  the  feature 
slots are converted into schema-specific feature assessment scores. In
experiment 1, for example, the feature "many aircraft" would receive a score of about
seven  in  any  attack  having  eight  aircraft.  This  and  the  following  two  steps  use 
data  stored  in  the  criteria  layer. 

5.  Feature  weighting.  Each  scored  feature  is  assigned  a  weight  generated  by 
the  schema  weight  assignment  rule. 

6. Feature combining. Features are combined by applying the assigned weights to
the assessed and scored features. The geometric mean was used in the experiments
reported here and it worked well, but the specific combination rule used
is not important to the model. A weighted arithmetic mean would probably work
about  as  well.  The  geometric  rather  than  arithmetic  mean  was  selected  for  this 
model  because  the  geometric  mean  allows  a  single  situation  feature,  which  is 
completely  inconsistent  with  a  particular  schema,  to  prevent  that  schema  from 
being  used  as  the  situation  model. 

7. Iteration of steps five and six at higher schema levels. In the all-out
attack example, steps four through six assessed, weighted and combined three
pairs of features: many ships and ship multi-axis assessed and combined into
overall ship threat; many aircraft and aircraft multi-axis assessed and combined
into overall air threat; and many submarines and submarine multi-axis assessed
and combined into overall submarine threat. In the iteration of steps five and
six at a higher level, the overall ship, air, and submarine threats will be
weighted and combined. The result of this feature combination is the score for
the attack quality. This score is the overall assessment of the attack.
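
Steps five through seven can be sketched as a weighted geometric mean applied first within each threat schema and then again across the three threat scores. This is a minimal illustration and not the experimenters' implementation: the function names, the equal-weight rule, and the example feature scores are all assumptions.

```python
import math


def weighted_geometric_mean(scores, weights):
    """Product of score ** (weight / total weight) over all features."""
    total = sum(weights)
    return math.prod(s ** (w / total) for s, w in zip(scores, weights))


def equal_weights(scores):
    """Assumed weighting rule: every feature counts the same."""
    return [1.0] * len(scores)


def assess_attack(feature_scores):
    """Combine (many, multi-axis) pairs per threat, then combine the threats."""
    threat_scores = []
    for many, multi_axis in feature_scores.values():
        pair = [many, multi_axis]
        threat_scores.append(weighted_geometric_mean(pair, equal_weights(pair)))
    return weighted_geometric_mean(threat_scores, equal_weights(threat_scores))


# Hypothetical feature assessment scores on the 1-10 scale.
example = {"ship": (2, 4), "air": (9, 9), "submarine": (6, 2)}
overall = assess_attack(example)
print(round(overall, 2))

# A score of zero lies outside the 1-10 scale and is used here only to show
# the veto property: one fully inconsistent feature drives the mean to zero.
print(weighted_geometric_mean([0, 10], [1, 1]), (0 + 10) / 2)
```

The last line contrasts the two combination rules: the arithmetic mean of the same two scores is 5, whereas the geometric mean is zero, which is the property the text gives as the reason for choosing the geometric mean.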

Critical  model  issues 

Experiments 1 through 3 test the adequacy of the proposed model for
mapping  the  connection  between  presented  information  and  assessed  all-out  attack 
and  barrier  quality.  These  tests  address  specific  information  processing  issues 
in the third through seventh steps described above; they also address the
ability of subjects to acquire stable and accurate schema from a sequence of
examples. 

Stability,  accuracy,  and  ease  of  learning  of  schema  abstracted  from 
examples.  The  training  is  intended  to  install  schema  for  all-out  attacks  and 
barriers  in  a  way  that  is  consistent  with  the  natural  acquisition  of  schema 
through  everyday  experiences.  In  this  training,  subjects  are  shown  ten  to 
twelve  examples  of  attacks  or  barriers.  For  each  example,  they  are  given  a 
numerical  rating  for  the  quality  of  the  barrier  or  attack,  and  told  the  features 
that  contribute  to  its  strength  and  weakness.  They  are  not  given  numbers  for 
feature  strength,  nor  are  they  told  the  relative  importance  of  the  different 
features. 

The  experiments  test  the  ease  with  which  schema  can  be  learned  from 
information  presented  this  way.  They  test  the  "accuracy"  of  the  subjects' 
schema,  as  measured  by  the  extent  to  which  the  subjects'  assessments  match  the 
assessments predicted by the schema-like model used to develop the training
pictures. In addition they test the "stability" of schema, as measured by the
consistency of subjects' assessments over time. Data for ease of learning,
stability, and accuracy test whether subjects attain and use a relatively
permanent cognitive model for their assessments.

Ease of learning is measured by the time it takes for a subject to learn
to  rate  a  set  of  training  pictures  consistently  (within  one  point  of  a  standard 
rating  for  that  picture).  It  is  hypothesized  that  if  situation  evaluation  is 
naturally  mediated  by  data  organized  within  schema  having  the  structure 
described previously, then training material designed to fit these structures
should  be  easy  to  learn.  If  nearly  all  subjects  can  learn  the  material  within 
two  or  three  presentations  of  the  set,  then  it  seems  likely  that  the  training  is 
taking  advantage  of  readily  developed  cognitive  structures. 
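
The learning criterion can be made concrete with a short sketch that counts how many presentations of the training set a subject needs before every rating falls within one point of the standard. The function name and the sample ratings are hypothetical; only the one-point tolerance comes from the text.

```python
def presentations_to_criterion(rating_passes, standard, tolerance=1):
    """Return the 1-based pass on which all ratings meet the criterion.

    rating_passes: list of rating lists, one per presentation of the set.
    standard: the standard rating for each training picture.
    Returns None if the criterion is never met.
    """
    for pass_number, ratings in enumerate(rating_passes, start=1):
        if all(abs(r - s) <= tolerance for r, s in zip(ratings, standard)):
            return pass_number
    return None


# Made-up ratings for three training pictures over two presentations.
standard = [4, 7, 2]
passes = [[6, 7, 2], [5, 6, 3]]
print(presentations_to_criterion(passes, standard))
```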

Stability  implies  that  people  are  basing  their  assessments  on  a  cognitive 
model that is not changing through the duration of the experiment. In the
all-out attack experiments subjects are presented with each test picture twice,
separated  by  about  an  hour.  During  this  hour  the  subjects  first  performed  a 
distraction  task,  and  then  rated  features  in  pictures  of  all-out  attacks. 

Because  the  number  of  test  pictures  exceeds  the  capacity  of  episodic  memory,  and 
because  the  time  interval  between  successive  estimates  of  the  same  attacks 
exceeds  item  retention  in  episodic  memory,  stability  also  implies  that  this 
cognitive  model  is  in  semantic  memory. 

Like  ease  of  learning  and  stability,  accuracy  implies  the  use  of  schema 
for  the  subjects'  assessments.  In  these  experiments,  the  training  picture 
ratings  were  derived  using  a  model  that  computes  barrier  or  all-out  attack 
quality  from  the  feature  characteristics.  An  accurate  schema  captures  this 
model.  It  enables  subjects  to  rate  each  picture  approximately  the  same  as  the 
model  would. 

In these experiments subjects' assessments are compared to the ratings
produced by the model. Accurate assessments of attacks or barriers not seen in
the training imply that these assessments are based on the model. This
implication is tested directly in the barrier experiments. The ten test pictures in
this  experiment  included  five  from  the  training  set  and  five  new  pictures.  If 
the  subjects'  assessments  of  situation  quality  for  the  new  pictures  were  as 
close  to  the  standard  as  were  their  assessments  for  the  pictures  seen  previously 
in  the  training,  then  it  may  be  concluded  that  the  pictures  are  being  evaluated 
using  schema  rather  than  by  remembering  the  actual  pictures  presented  in 
training. 
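
One way to make this comparison concrete is to compute the mean absolute deviation from the standard ratings separately for previously seen and new pictures. The sketch below is an illustration only: the function name is hypothetical, the ratings are invented, and the text does not state which closeness measure was actually used.

```python
def mean_abs_error(subject_ratings, standard_ratings):
    """Average absolute difference between subject and standard ratings."""
    diffs = [abs(s - m) for s, m in zip(subject_ratings, standard_ratings)]
    return sum(diffs) / len(diffs)


# Invented ratings for five training-set pictures and five new pictures.
standard_old, subject_old = [4, 7, 2, 9, 5], [5, 7, 3, 8, 5]
standard_new, subject_new = [3, 8, 6, 1, 7], [3, 7, 6, 2, 8]

err_old = mean_abs_error(subject_old, standard_old)
err_new = mean_abs_error(subject_new, standard_new)
print(err_old, err_new)
```

Comparable errors for the two sets would be consistent with schema-based evaluation; a clearly larger error on the new pictures would suggest reliance on memory for the specific training pictures.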

Assessment  of  feature  relevance  and  functional  substitution.  (Step  3  in 
the  information  processing  model.  The  first  and  second  steps  are  not  examined 
in  these  experiments.)  Data  in  schema  enable  relevant  features  to  be  identified 
and  used  for  assessment.  These  data  should  enable  people  to  identify  and  use 
features composed of objects physically different from, but functionally
equivalent to, the objects included in the training materials. Functional
substitution is this ability to use functionally equivalent objects in the
schema-based assessments.

Two  different  mechanisms  for  functional  substitution  seem  plausible. 

The first possibility is that functional substitution occurs very
early in the processing. Objects such as ships or islands are first classified.
Their relevance to the schema is then determined by comparing the physical and
functional properties of the objects with the properties specified by the schema
slots. If an object is determined to be relevant, then it will be used in the
situation evaluation. Since the feature criteria curves used in step four are
unlikely to have been developed for objects not included in the training
material, a mechanism must exist to enable existing feature criteria curves to
accommodate these objects. It is proposed that this mechanism is to
substitute a functionally equivalent number of old objects for the new objects, and
then to use the existing criteria curves with the old objects. Thus, in the
island part of experiment 3, an island would be converted into a certain,
functionally equivalent, number of ships, and then the criteria curve developed to
evaluate ships is used to evaluate the effect of the islands.

A  second  possibility  for  functional  substitution  is  that  it  occurs  later 
in  the  evaluation  process.  In  this  case,  the  schema  will  cause  each  object  to 
be  evaluated  according  to  the  physical  criteria  developed  in  the  training 
material.  For  example,  the  feature  "barrier  length"  might  be  judged  from  the 
distance  between  the  two  ships  at  the  ends  of  the  barrier.  A  barrier  with  the 
two  end  ships  far  apart  would  be  evaluated  high  on  this  feature;  one  with  the 
two  end  ships  near  together  would  be  evaluated  low.  If  functional  substitution 
occurs  late  in  the  process,  then  a  barrier  with  two  ships  near  one  another  would 
initially  be  rated  the  same  on  the  "barrier  length"  feature  whether  or  not  there 
exist  other  objects  in  the  picture  that  function  as  blockading  ships  beyond  the 
two  end  ships.  Thus,  a  short  barrier  completely  blocking  a  channel  inlet  would 
score  low.  The  late  substitution  alternative  proposes  that  low  scoring  pictures 
with  unusual  objects  would  be  re-evaluated.  This  second  evaluation  would  not 
use  the  physical  properties  of  the  objects  in  the  picture  (ship  distances,  for 
example)  but  rather  would  use  the  functional  properties  of  these  objects 
(ability  to  block  passage). 

Both of these alternatives for functional substitution seem plausible,
and each offers some advantages in information processing. The former, with
early functional substitution, does not require that pictures that score poorly
be reevaluated using a second set of features concerned with functionality. The
latter, which uses functional properties only when necessary, allows evaluations
for most cases to be made without requiring that object functionality be
considered at all.

Experiment  3  is  designed  to  discriminate  between  these  two  alternatives. 
It  presents  pictures  with  objects  not  seen  in  the  training  pictures,  and  elicits 
overall scores and scores for "physical" (distance between ships) and
"functional"  (barrier  is  hard  to  go  around)  features.  If  the  subjects'  overall 
evaluation  can  be  predicted  only  from  the  functional  features,  and  not  from  the 
physical  features,  then  the  second  alternative  proposing  separate  functional 
and physical tests is supported. If the subjects' overall evaluation is
predicted equally well from the physical or functional features, then the first
alternative,  early  functional  substitution,  would  be  favored. 

Feature assessment (step 4 in the model). In the feature assessment
step, physical features, which are measurable quantities such as the number of
ships in a picture or the distance between two ships, are converted to related
subjective assessments, such as "many ships" or "barrier length". These
assessments  are  schema  specific,  and  actually  mean  "many  ships  for  the  purpose 
of  all-out  attacks",  or  "barrier  length  sufficient  for  barrier  to  be  effective". 

The model proposes that feature criteria curves are used to assess and
score  features  as  needed  for  situation  assessment.  If  this  is  the  case,  then 
it  is  expected  that  the  feature  score  would  be  related  to  an  underlying  physical 
variable  in  a  simple  monotonic  way  and  that  this  relationship  would  not  depend 
on  the  values  of  other  features  in  the  picture.  It  is  also  expected  that  the 
criteria  curve,  being  schema  specific,  would  closely  reflect  the  training 
materials.


All  experiments  address  the  use  of  feature  criteria  curves.  Experiments 
1  and  2,  however,  are  designed  to  examine  these  issues  critically.  These  two 
experiments differ only in the number of objects in each training and test
picture. Every picture in experiment 2 has 50% more hostile ships, submarines, and
aircraft  than  the  corresponding  picture  in  experiment  1. 

It  is  expected  that  if  the  criteria  curves  are  determined  entirely  by 
the  training  pictures,  then  the  curves  from  the  two  experiments  would  differ 
only by a 50% scaling factor. Thus, if six ships in experiment 1 receive a
score  of  7  for  the  feature  "many  ships",  then  nine  ships  in  experiment  2  would 
receive  a  score  of  7  for  that  feature.  If  this  relationship  is  not  true,  then 
the  curves  must  be  determined  both  by  the  training  material  and  also  by  general 
concepts  related  to  feature  evaluations.  For  example,  nine  ships  in  the  second 
experiment  might  be  scored  higher  than  six  ships  in  the  first  experiment  because 
nine  scores  higher  in  the  general  category  "many"  than  does  six. 
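
The scaling prediction can be expressed directly: if the curves depend only on the training pictures, the experiment 2 curve is the experiment 1 curve with its horizontal axis stretched by 1.5. In the sketch below the anchor points of the experiment 1 curve are hypothetical (chosen so that six ships score 7, as in the example above), and the piecewise-linear shape and function names are assumptions.

```python
def interpolate(points, x):
    """Piecewise-linear interpolation through sorted (x, y) points, clamped."""
    if x <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]


# Hypothetical experiment 1 curve for "many ships": (ship count, score).
EXP1_CURVE = [(1, 1.0), (6, 7.0), (7, 10.0)]


def exp1_score(n_ships):
    return interpolate(EXP1_CURVE, n_ships)


def exp2_score_if_training_only(n_ships, scale=1.5):
    """Predicted experiment 2 score if the curve merely rescales by 50%."""
    return exp1_score(n_ships / scale)


print(exp1_score(6), exp2_score_if_training_only(9))
```

Observed experiment 2 scores that sit above this prediction would indicate that a general sense of "many" also contributes to the curve.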

During  training  subjects  were  never  given  any  numerical  ratings  for 
feature  quality.  Instead  they  were  given  only  the  overall  picture  rating  and 
qualitative  feature  assessments.  Because  of  this  and  the  fact  that  there  are 
many  different  ways  to  combine  two  feature  scores  to  yield  an  overall  rating,  it 
is  not  expected  that  the  feature  criteria  curves  inferred  by  the  subjects  would 
match  the  criteria  curves  used  in  the  model  to  develop  the  training  and  test 
materials. These experiments offer an opportunity to observe discrepancies
between the model criteria curves and the curves inferred by subjects.

Feature  weighting  (step  5  in  the  model).  The  model  proposes  that 
overall  picture  assessments  are  the  weighted  geometric  mean  of  the  feature 
scores.  It  is  expected  that  in  each  picture  some  features  contribute  more  to 
the overall assessment than do other features. For example, in these
experiments barrier quality depends only on two features: barrier length and barrier
solidity.  In  the  model  used  to  develop  the  experiment  materials  overall  barrier 
quality  depended  primarily  on  the  quality  of  the  weaker  feature.  For  instance, 
a  barrier  that  is  very  long  but  has  big  holes  would  be  rated  low  because  it  is 
easy  to  pass  through.  On  the  other  hand,  it  is  expected  that  a  barrier  that  is 
very  solid  but  quite  short  would  also  be  rated  low,  because  it  is  easy  to  pass 
around. 

The  experiments  test  several  issues  concerned  with  weighting.  They  test 
whether  or  not  people  do  weight  features  differently  for  different  examples  of 
all-out  attack  or  barrier.  They  test  whether  these  weightings  can  be  derived 
from  a  simple  weighting  rule  for  all  pictures  corresponding  to  a  single  schema, 
and  whether  different  schema  have  different  rules. 

There  are  two  different  ways  for  the  experimenters  to  infer  subjects' 
feature  weights.  One  way  is  from  the  feature  importance  ratings  provided  by  the 
subjects.  If  the  feature  weights  are  the  same  as  the  importance  scores,  then 
feature weights can be obtained directly from these ratings. Feature weights
can also be obtained indirectly, however, by finding a weight assignment rule
that produces weighted geometric means close to the picture ratings. By
comparing the weights obtained by these two methods, it is possible to determine how
importance ratings relate to feature weights.

Feature combination (step 6 in the model). The key prediction of this
model is that the weighted geometric means of subjects' feature assessments
approximate their assessments of the overall attack or barrier quality. This
prediction is tested directly in each of the three experiments. Poor
correlation between these weighted means and the assessments would invalidate the
model.  Good  correlation  would  suggest  its  utility  for  modeling  the  information 
processing  for  situation  assessment. 
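
As a minimal sketch of this test, the Pearson correlation between the weighted geometric means and the overall assessments can be computed as below. The ratings are made-up illustration values, not data from the experiments, and `pearson_r` is an illustrative helper. With ten pictures the correlation has 10 - 2 = 8 degrees of freedom.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-picture values: weighted geometric mean of the feature
# ratings (predicted) and the overall effectiveness rating (observed).
predicted = [2.1, 3.4, 4.0, 5.2, 6.8, 7.5, 8.1, 9.0, 3.0, 6.0]
observed = [2.5, 3.0, 4.4, 5.0, 6.5, 7.9, 8.0, 9.2, 3.3, 5.6]

r = pearson_r(predicted, observed)
degrees_of_freedom = len(predicted) - 2
variance_explained = r ** 2  # proportion of variance accounted for
```

Squaring the correlation gives the proportion of variance accounted for, which is how a coefficient translates into a "percent of the variance" statement.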


Experiments 


In this research program, three related experiments were conducted to
address the issues suggested by our model of schema-based information
processing. All three experiments provided data to test our hypotheses regarding
the relationship between situation assessment and feature assessment, feature
weighting, and feature combination. These data also address the extent to which
subjects are able to infer and use the model used to develop the experimental
materials. In addition, each experiment provided some data on particular
aspects of the model.

Experiment  1:  All-Out  Attack,  Low  Density 

This experiment was designed to test several properties of schema as
used for situation assessment. Specifically, this experiment provides
information about criteria curves for feature assessment and scoring, about rules for
feature weighting, and about the relationship between weighted features and
overall effectiveness ratings. The experiment also provides data for comparing
the subjects' curves with the curves extracted from our expert* and used to
develop the training and test pictures. In addition, this experiment assesses
the stability of the schema generated through the training procedure.

Methods 

Materials.  The  materials  for  this  experiment  consisted  of  a  set  of  12 
training  pictures,  10  test  pictures,  a  set  of  feature  evaluation  sheets,  and  the 
Raven Progressive Matrices Test (1958), which was used as a distractor task.

* A retired Navy commander.


The  training  and  test  pictures  illustrated  threats  capable  of  mounting 
all-out  attacks  of  different  effectiveness.  These  pictures  were  developed  using 
a  schema-like  model  of  attack  effectiveness.  This  model  represented  an  expert's 
schema,  and  was  developed  by  working  with  this  expert.  In  developing  this  model, 
a  set  of  pictures  of  all-out  attacks  and  a  set  of  features  thought  to  be  the 
basis  of  the  effectiveness  ratings  for  the  attacks  were  developed.  The  expert 
was  then  asked  to  rate  1)  the  overall  effectiveness  of  the  attack  shown  in  each 
picture,  and  2)  how  characteristic  each  feature  in  the  attack  is  of  a  very 
effective attack. The feature criteria curves that the expert was using were
determined from these ratings; that is, the relationships between the physical
properties of the picture (e.g., number of ships, submarines) and the subjective
ratings for those features (e.g., many ships, submarines) were plotted. These
curves were used to generate a new set of pictures. For each of these new
pictures, the overall assessment expected from the expert was predicted by
converting the physical properties of the picture into subjective feature ratings
and combining them using the geometric mean rule. The expert was again asked to
rate the overall effectiveness of the picture. When discrepancies occurred
between the predicted and actual ratings, the expert was queried for the cause of
the  discrepancies.  The  set  of  features  and  feature  criteria  curves  were  then 
modified  based  on  that  feedback.  The  entire  process  was  repeated  until  the 
overall  assessments  given  by  the  expert  were  predicted  accurately  from  a 
weighted  combination  of  the  feature  scores  as  calculated  from  our  previously 
extracted feature criteria curves. It is important to note here that the
holistic preferences of the expert were never subject to question; in each case
the weights and criteria were changed, or identified, such that they matched
these. Through this process three main features were finally identified as
being important in determining the effectiveness of an all-out attack. These
were the overall ship strength, overall submarine strength, and the overall
aircraft strength. For each of these features, the overall strength was a
function of the number of platforms and of the number of directions from which the
platforms  were  able  to  attack.  Any  given  picture  depicts  each  of  these 
features;  each  feature  could  be  rated  on  a  scale  of  one  to  ten  in  terms  of  how 
characteristic  it  is  of  an  effective  attack.  The  overall  effectiveness  of  the 
attack  results  from  a  weighted  geometric  mean  of  the  individual  feature  ratings. 
The feature criteria curves used to develop the materials for the low density
attack (experiment 1) are shown in Table A-1 in Appendix 1.

The  training  and  test  pictures  developed  from  this  model  illustrated 
the  full  range  of  possible  attack  effectiveness.  Each  picture  was  generated  by 
choosing  a  level  (high,  medium,  or  low)  for  each  feature  and  creating  an  attack 
representing  these  feature  levels.  The  set  of  pictures  was  generated  by  varying 
the  levels  in  a  systematic  way.  The  set  of  pictures  included  some  where  all  the 
features  were  rated  low,  some  where  all  the  features  were  rated  high,  and  some 
where  the  features  represented  a  range  from  low  to  high.  This  process  yielded  a 
set  of  training  and  test  pictures  that  had  predicted  overall  effectiveness 
ratings  throughout  the  full  range,  from  one  to  ten.  An  example  of  a  training 
picture,  with  its  effectiveness  rating  and  explanation  for  the  rating,  is  shown 
in  Figure  1.  Table  A-2  shows  the  design  criteria  for  each  test  picture  in  the 
all-out  attack  experiment. 

The feature evaluation sheets contained a list of the features relevant
to determining the effectiveness of an all-out attack. For each feature, there
were blanks to be completed regarding a) the extent to which that feature in
the accompanying picture is characteristic of a very good attack (score 10), a
very poor attack (score 1) or an intermediate attack (intermediate score); b)
how important that feature would be to an overall estimate of the effectiveness
of the threat; and c) how confident the subjects were of the ratings they had
just assigned for that feature. Each feature evaluation sheet was accompanied
by one of the test pictures.

Procedure.  The  experiment  began  with  a  training  session.  In  this 
session,  subjects  were  provided  with  background  material  explaining  the  basic 
Battle  Group  scenario  with  which  they  would  be  working.  Subjects  were  trained 
to  recognize  signs  of  all-out  attacks  by  first  instructing  them  on  the  signs  of 
an  impending  attack  and  then  by  showing  them  six  examples  of  all-out  attacks. 
They were told how highly each picture had been rated by an expert using a
10-point scale. They were also told which situation features were responsible
for each example's rating. (See Figure 1 for an example.) Following this, they
were shown an additional six pictures and asked to predict what each picture's
rating would be. The expert's actual rating for each picture was then
presented, along with an explanation of which situation features were responsible
for the rating. After they had seen all twelve training pictures, they were
asked to go back through all twelve pictures and predict the expert's rating
until they could predict 9 of the 12 scores within one point on one pass through
the pictures. Data from subjects who could not achieve this level of
performance after three trials were not used in further analyses.
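
The training criterion can be sketched as a simple check. The function names and the example ratings below are hypothetical; the thresholds (9 of 12 predictions within one point, at most three passes) are those stated above.

```python
def reached_criterion(predicted, expert, tolerance=1, required=9):
    """True if enough predictions fall within `tolerance` points of the expert."""
    hits = sum(1 for p, e in zip(predicted, expert) if abs(p - e) <= tolerance)
    return hits >= required


def passes_training(trials, expert, max_trials=3):
    """Return the 1-based pass on which criterion was reached, or None.

    `trials` holds one list of twelve predictions per pass through the
    training pictures; passes beyond `max_trials` are ignored.
    """
    for trial_number, predictions in enumerate(trials[:max_trials], start=1):
        if reached_criterion(predictions, expert):
            return trial_number
    return None


# Hypothetical expert ratings and two passes by one subject.
expert = [1, 3, 5, 7, 9, 10, 2, 4, 6, 8, 10, 5]
pass1 = [1, 3, 5, 7, 9, 10, 2, 4, 9, 4, 6, 9]   # 8 of 12 within one point
pass2 = [2, 3, 4, 7, 8, 10, 2, 5, 6, 8, 7, 9]   # 10 of 12 within one point

trials_needed = passes_training([pass1, pass2], expert)
```

A subject for whom `passes_training` returns `None` corresponds to one whose data were excluded from further analysis.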

After the training session, subjects were shown ten test pictures.
For each picture, they were asked to rate how effective the pictured all-out
attack was and how confident they were that their effectiveness rating would
match our expert's rating within one point. Each of these judgments was made
on a 10-point scale.


When the ratings had been completed, subjects were asked to work on a
series of puzzles, which were designed to serve as a distractor task. After
working on these puzzles for twenty minutes, the subjects were given the feature
rating sheets for the ten test pictures. For each feature in each picture, they
were asked to rate a) to what extent the feature shown in the picture was
characteristic of a very good attack; b) how important the feature would be to
an overall assessment of the effectiveness of an all-out attack; and c) how
confident they were of their ratings.

After  all  the  feature  sheets  had  been  completed,  the  subjects  were 
asked  to  make  a  new  set  of  effectiveness  and  confidence  ratings  for  the  set  of 
ten  test  pictures. 

Subjects.  The  subjects  were  25  undergraduate  students  at  George  Mason 
University  in  Fairfax,  Virginia.  Five  of  these  subjects  were  unable  to 
accurately  predict  the  overall  effectiveness  ratings  for  the  training  pictures 
after  three  trials;  their  data  were  not  analyzed  further.  The  students  received 
either  course  credit  or  payment  for  their  participation  in  the  study. 

Results  and  Discussion 

Our model assumes that schema are stable structures which are easily
developed abstractions of examples. The data support this hypothesis. The
mean number of trials required to reach criterion on the practice materials was
1.72, which suggests that the schema are easily learned, although five
participants in this experiment did not reach criterion by the third
trial.

The stability of the schema can be seen in the stability of the overall
effectiveness ratings, which were made independently, 60 minutes apart. These
ratings, averaged over subjects, can be seen in Table 1. The test-retest
correlation for these ratings is extremely high (r(8) = 0.986, p < .01), suggesting
that the average rating for each picture is consistent over the 60-minute time
span between the initial and final ratings. The stability of the ratings within
individuals is also remarkable. Table 2 shows the frequency distribution of
the difference between the first and second estimates of overall effectiveness
for the first two experiments. Roughly one-third of the responses were
identical on the two trials; 71% of the ratings on the second trial were within plus
or minus one of the original rating; and 88% of the ratings on the second trial
were within two points of the original response.
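
The within-subject stability figures are proportions of absolute differences between first and second ratings. A sketch with hypothetical ratings (not the Table 2 data), using an illustrative `stability_summary` helper:

```python
def stability_summary(first, second):
    """Proportion of rating pairs that are identical, within 1, and within 2."""
    diffs = [abs(a - b) for a, b in zip(first, second)]
    n = len(diffs)
    return {
        "identical": sum(d == 0 for d in diffs) / n,
        "within_one": sum(d <= 1 for d in diffs) / n,
        "within_two": sum(d <= 2 for d in diffs) / n,
    }


# Hypothetical first and second effectiveness ratings for ten pictures.
first_ratings = [3, 5, 7, 9, 2, 6, 8, 4, 10, 1]
second_ratings = [3, 6, 7, 7, 2, 5, 8, 7, 9, 1]

summary = stability_summary(first_ratings, second_ratings)
```

The three proportions are cumulative, so "within one" includes the identical responses and "within two" includes both.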

Another aspect of the data concerned the development of feature
criteria curves for feature assessment and scoring. These curves relate the
physical features of the picture (e.g., the number of ships) to the subjective
ratings for that feature. Figure 5 shows the observed relationships between the
actual number of platforms (ships, aircraft, and submarines) and the features
"many ships/aircraft/submarines." Also shown are the curves used in the model
to generate the experiment materials (labeled "target rating"). The curves are
monotonically increasing (except for one point), with a consistent
underestimation of the number of platforms relative to the curve used in the model.

The  model  also  predicts  that  subjects'  overall  attack  effectiveness 
ratings  are  approximated  by  the  weighted  geometric  mean  of  the  individual 
feature  scores.  We  proposed  that  the  weight  assigned  to  each  feature  will  be 
related  to  that  feature's  importance  rating  and  that  the  geometric  mean  of  the 
features  so  weighted  will  predict  overall  attack  effectiveness  more  accurately 
than  will  the  geometric  means  weighted  in  other  ways.  If  these  properties  are 
true, then a) the correlation between the overall assessments and the geometric
mean of the individual feature ratings should account for a significant
proportion of the variance in the overall ratings; and b) this relationship should be
improved by weighting the individual features by their importance rating before
calculating the geometric mean.

TABLE 1. Average all-out attack effectiveness ratings for first and
second evaluations of all-out attacks, low density case (experiment 1)
and high density case (experiment 2).

Figure 5b. Feature Scaling Curves Relating Subjects' Ratings "Many Aircraft" To Number Of Aircraft In Test Pictures.

Figure 5c. Feature Scaling Curves Relating Subjects' Ratings "Many Subs" To Number Of Subs In Test Pictures.

In  this  experiment,  subjects  rated  the  features,  their  importance,  and 
the  overall  effectiveness  of  the  pictures  independently.  They  rated  overall 
effectiveness  at  a  different  time  from  their  other  ratings,  and  did  not  have 
available  a  record  of  their  ratings  made  at  different  times.  We  calculated 
unweighted and weighted geometric means of the individual feature ratings. The
weights were obtained from the importance ratings by converting each rating of
high, medium, and low to 3, 2, and 1, respectively, and then squaring this
number (so that a feature rated high counts nine times as much as one rated low). The
unweighted and weighted geometric means accounted for 92 and 97 percent of the
variance in the overall assessments (r(8) = 0.960 and 0.983, p < .01). The
relationship  between  the  weighted  features  and  the  overall  assessments  for 
experiments  1  and  2  can  be  seen  graphically  in  Figure  6.  Table  3  presents  the 
correlations  and  regression  lines  for  each  individual  subject  in  experiments  1 
and  2.  This  table  shows  that  the  relationship  shown  in  Figure  6  for  data 
averaged  over  subjects  is  also  observed  for  individuals. 
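
The importance-based weighting rule can be sketched as follows. Normalizing by the sum of the weights is an assumption (the formula is not spelled out in the text), the function names are illustrative, and the feature scores in the example are hypothetical.

```python
import math

# Importance levels as coded in the text: low = 1, medium = 2, high = 3.
IMPORTANCE_LEVELS = {"low": 1, "medium": 2, "high": 3}


def importance_weight(level):
    """Weight for a feature: the square of its coded importance level."""
    return IMPORTANCE_LEVELS[level] ** 2


def weighted_geometric_mean(scores, weights):
    """exp(sum(w * ln s) / sum(w)); assumes all scores are positive."""
    total = sum(weights)
    log_sum = sum(w * math.log(s) for s, w in zip(scores, weights))
    return math.exp(log_sum / total)


# Hypothetical ratings for ship, submarine, and aircraft strength (1-10 scale)
# with their rated importance.
scores = [8, 2, 8]
importance = ["high", "low", "high"]
weights = [importance_weight(level) for level in importance]  # [9, 1, 9]

weighted = weighted_geometric_mean(scores, weights)
unweighted = weighted_geometric_mean(scores, [1, 1, 1])
```

In this example the weak submarine feature is rated unimportant, so the weighted mean stays close to the two strong features, while the unweighted mean is pulled down toward the weak one.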

While  the  weighting  did  not  increase  by  much  the  already  high  variance 
accounted  for  by  the  unweighted  geometric  means  of  features,  weighting  did 
significantly  reduce  the  absolute  difference  between  the  overall  effectiveness 
ratings  and  the  feature  geometric  means.  Table  4  shows  the  average  overall 
effectiveness  rating  for  each  picture  and  the  predicted  overall  effectiveness 
rating  using:  a)  the  unweighted  geometric  mean,  b)  the  geometric  mean  weighted 
by the squares of the importance ratings, and c) the geometric mean weighted by
the actual feature assessment ratings. The remaining columns show the
difference between the overall effectiveness ratings and the ratings predicted by
each weighting process. It can be seen that using a mean weighted by importance
reduced the average error from 1.23 rating points (on a 10-point scale) to 0.56
rating points, a 54% reduction in average error.

Figure 6. Average Effectiveness As A Function Of Weighted Geometric Mean Of Feature Ratings.
Numbering Shows Picture Number, Experiment Condition. Geometric Mean Is
Weighted By Square Of Importance Ratings.

TABLE 3. Individual correlation coefficients and least squares coefficients on
all-out attack. Correlation is between assessment of attack effectiveness
and geometric means weighted by squares of importance rating. Least
squares fits line:

Average attack effectiveness = a x geometric mean of features + b.
Subject pools for "low" and "high" cases are different.

[Table 3 body lost in extraction; surviving summary values, in source order: 0.89, 1.20, Average, 0.89, 0.10, 0.18, S.D., 0.09, 0.15, 0.92]

[Table panels: EXPERIMENT 1: LOW DENSITY; EXPERIMENT 2: HIGH DENSITY]

TABLE 4. Geometric means of all-out attack feature scores, unweighted, weighted by square
of importance, and weighted by feature rating. *Pictures with ships, submarines,
and aircraft that differ most in strength.
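
The error reduction reported for the importance-weighted mean is a relative change in average absolute error. A small sketch of that arithmetic, using the two averages quoted in the text (the helper names and the three-picture example are illustrative):

```python
def mean_abs_error(observed, predicted):
    """Average absolute difference between observed and predicted ratings."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def percent_reduction(before, after):
    """Relative reduction in error, as a percentage of the starting error."""
    return 100 * (before - after) / before


# Average errors quoted in the text, in rating points on the 10-point scale.
unweighted_error = 1.23
weighted_error = 0.56
reduction = percent_reduction(unweighted_error, weighted_error)

# Hypothetical three-picture example of the per-picture error calculation.
example_error = mean_abs_error([5, 7, 3], [4, 7, 5])
```

The computed reduction falls between 54 and 55 percent, which agrees with the 54% figure in the text.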

Another interesting result is the similarity between the means weighted
by importance and the means weighted by the actual feature score. These two
weighting procedures give virtually the same predicted overall rating. This
result suggests that importance ratings for these features are derived from the
assessed effectiveness of the feature; that is, features rated more
characteristic of effective attacks were rated more important than were features
regarded as less characteristic of effective attacks. Because in this
experiment rated importance seems related to feature weight, this result also suggests
that the criteria curves used to generate feature scores were also used to
generate feature weights.

The data in this experiment indicate that the subjects' schema are
reasonably accurate. These data show that subjects' estimates of overall attack
effectiveness approximate the ratings predicted for the test pictures based on
our model of the expert's knowledge. Table 5 compares, for each picture, the
target ratings calculated from the model with the observed attack ratings
averaged over subjects and trials for each experiment. The
correlation between the target and observed ratings was significant (r(8) = ...).

TABLE 5. Comparison of subjects' Effectiveness Ratings with ratings produced by
model of the expert's knowledge. "Actual low" and "actual high"
refer to the low density and high density attack experimental conditions.


Experiment 2: All-Out Attack, High Density

This experiment was designed to provide data on the feature criteria
curves  used  for  feature  assessment.  The  model  proposes  that  such  curves  will  be 
inferred  from  the  examples  provided  during  training.  To  test  this  proposition, 
each  training  and  test  picture  in  experiment  1  was  modified  to  contain  50%  more 
ships,  aircraft  and  submarines,  and  then  used  in  this  experiment.  If  the 
feature  assessment  curves  are  derived  solely  from  the  examples  provided  in 
training,  then  the  feature  scaling  curves  derived  from  experiments  1  and  2 
should  be  the  same  except  for  an  x-axis  scaling  factor  of  50%. 

In addition, since the procedure was identical to that used in
experiment 1, the experiment provides a replication of the data collected on feature
scaling,  weighting  and  combination  rules,  on  the  extent  to  which  subjects  learn 
the  model  we  used  to  develop  the  pictures,  and  on  the  stability  of  these  schema. 

Method 

Materials. The materials for this experiment were the same as those
used  in  experiment  1  except  that  each  of  the  training  and  test  pictures  in 
experiment  2  contained  50%  more  platforms  (ships,  submarines,  aircraft)  than  the 
corresponding  picture  in  experiment  1.  The  additional  platforms  were  placed 
close  to  the  original  platforms  in  order  to  minimize  any  effect  on  the  perceived 
number  of  attack  axes. 

Procedure.  The  procedure  for  this  experiment  was  identical  to  that 
described  for  experiment  1. 

Subjects.  The  subjects  were  20  undergraduate  students  at  George  Mason 
University  in  Fairfax,  Virginia.  The  students  received  either  course  credit  or 
payment  for  their  participation  in  the  study. 


Results  and  Discussion 


This experiment was designed to work with experiment 1 in order to test
the properties of the feature criteria curves used for feature assessment.
The experiment also provided data that independently support many of the
hypotheses examined in the first experiment.

In  this  experiment,  all  subjects  reached  criterion.  The  mean  number  of 
trials  required  to  reach  criterion  was  1.50.  The  schema  stability  results  of 
this  experiment  resemble  those  of  the  first.  The  average  effectiveness  ratings 
for each picture on each trial are shown in Table 1. The test-retest
correlation between these ratings is significant (r(8) = 0.977, p < .01), and
indicates that the ratings are stable over time.

The  data  pertaining  to  feature  weights  and  the  relationship  between  the 
assessed  attack  effectiveness  and  feature  geometric  means  also  resembled  those 
from  the  first  experiment.  We  again  found  that  both  the  unweighted  and  weighted 
geometric  means  accounted  for  more  than  90%  of  the  variance  in  the  overall 
effectiveness  ratings  and  that  the  weighted  geometric  means  accounted  for 
somewhat  more  variance  than  did  the  unweighted.  The  geometric  mean  of  the 
weighted  individual  feature  rating  is  plotted  against  the  overall  effectiveness 
ratings  in  Figure  6.  Again,  the  importance  of  weighting  is  shown  by  the  data  in 
Table  4,  as  discussed  in  the  results  section  for  experiment  1. 

Table  5  compares  the  target  ratings  calculated  for  each  picture  from 
the  model  with  the  observed  attack  effectiveness  ratings  averaged  over  subjects 
and  trials  for  each  experiment.  Again,  the  correlation  between  the  targets  and 
observed ratings for this experiment was significant (r(8) = 0.949, p < .01).


Data collected in this experiment reveal how the training pictures
interact with "commonsense" knowledge to affect the feature criteria curves used
for  feature  assessment.  Our  initial  hypothesis  was  that  feature  criteria  curves 
are  abstracted  solely  from  the  examples  given.  If  subjects'  criteria  curves 
derive  only  from  the  examples  given  in  training,  then  the  overall  assessments 
for  the  feature  "many  platforms"  should  be  the  same  for  corresponding  pictures 
in  experiments  1  and  2  even  though  the  number  of  platforms  has  changed.  If  this 
were  the  case,  then  the  criteria  curves  inferred  from  the  pictures  used  in 
experiment  2  (with  50  percent  more  platforms)  would  differ  from  the  curves 
inferred  from  the  pictures  used  in  experiment  1  only  by  a  50%  scaling  factor  on 
the  x-axis.  The  results  suggest  that  this  was  not  the  case.  The  curves  from 
experiment  2  (Figure  5)  reflected  only  part  of  the  50%  increase  expected. 

Although  the  feature  criteria  curves  depend  only  in  part  on  the 
training  pictures,  the  overall  attack  assessments  seem  to  depend  solely  on  the 
training. The attack evaluation scores for corresponding pictures in
experiments 1 and 2 are virtually the same.

Experiment  3:  Barriers 

This experiment served several functions. First, it reexamined several
issues in experiments 1 and 2, providing a second example of feature criteria
curves, feature weighting rules, and feature combination for situation
assessment. Second, it examined several new issues, providing data to determine
whether people use functional as well as physical properties of objects to make
their judgments, and to determine whether people would apply their everyday
knowledge of objects to existing schema. In addition, it was designed to
examine the relationship between similarity assessments of pairs of situations and
their constituent feature ratings.

Method

Materials. The materials for this experiment consisted of a set of ten
training  pictures,  seventeen  test  pictures,  the  feature  evaluation  sheets, 
feature  comparison  sheets,  and  the  Raven  Progressive  Matrices  Test  (1958),  which 
was  used  as  a  distractor  task. 

The  set  of  training  pictures  and  one  set  of  test  pictures  were 
constructed  from  a  model  of  barrier  goodness  that  specified  feature  criteria 
curves and a feature weighting rule. Unlike the model in experiment 1, this model was not
developed from an expert's model for barrier evaluation. This model specifies
two features relevant for barrier effectiveness assessment: the length of the
barrier and the solidity of the weakest part of the barrier. Two feature
criteria curves relate the measurable properties of the picture's features
(distance between the two end ships/subs, distance between the two platforms on either
side of the largest internal gap) to subjective feature scores (barrier length
and solidity). These relationships are shown in Table A-3. Overall barrier
effectiveness is calculated from the weighted geometric mean of feature scores
obtained from these criteria curves. Since in our model of barrier
effectiveness a barrier was only as strong as its weakest link, the weaker feature is
weighted by .75 and the stronger by .25.
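
The weakest-link weighting rule can be sketched directly from this description. The function name and the example scores are hypothetical; the .75/.25 weights are those given in the text.

```python
import math


def barrier_effectiveness(length_score, solidity_score, weak_weight=0.75):
    """Weighted geometric mean that puts `weak_weight` on the weaker feature.

    Both scores are assumed to lie on the positive 1-10 feature scale.
    """
    weaker = min(length_score, solidity_score)
    stronger = max(length_score, solidity_score)
    log_mean = weak_weight * math.log(weaker) + (1 - weak_weight) * math.log(stronger)
    return math.exp(log_mean)


# A very long barrier with big holes and a very solid but short barrier
# receive the same low rating, because the rule depends only on which
# feature is weaker, not on which feature it is.
long_but_porous = barrier_effectiveness(length_score=10, solidity_score=2)
solid_but_short = barrier_effectiveness(length_score=2, solidity_score=10)
long_and_solid = barrier_effectiveness(length_score=10, solidity_score=10)
```

With these hypothetical scores the mismatched barriers rate just under 3, well below the arithmetic average of 6, which is the "only as strong as its weakest link" behavior the model intends.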

For  the  first  part  of  this  experiment,  15  pictures  were  developed;  five 
pictures  were  shown  during  training  only,  five  were  shown  during  test  only,  and 
five  were  shown  both  as  training  and  test  pictures.  The  overall  ratings  ranged 
from  two  to  ten  for  both  the  training  and  test  pictures.  An  example  of  a  test 
picture is shown in Figure 7. Table A-4 shows the overall design for the
barrier pictures.

Figure 7. An example of a training picture for experiment 3, Barriers.

During training, subjects were told:

Picture 1: Effectiveness = 10. The barrier is both long and solid. The
ships at the two ends are sufficiently far apart to make the barrier
difficult to go around. The platforms are close enough together throughout
its entire length to make passage through the barrier very difficult.

Seven pictures were developed for the second part of the test. These
pictures were modifications of pictures shown in the first part. Five of the
pictures were modified by adding either an island or a peninsula to the picture.
This procedure created pictures which physically matched one of the original
test pictures in terms of number and location of platforms, but functionally
matched a second original test picture in terms of length and solidity of the
barrier. An example of a test picture from this set and the physically and
functionally equivalent pictures is shown in Figure 8. Two other new test
pictures were created by taking two of the original test pictures and moving the
platforms to one side, so that they were no longer centered in front of the
battle group. Again, this created pictures which were physically similar to
one of the original test pictures, but functionally similar to another original
test picture.

The  feature  evaluation  sheets  for  this  experiment  were  similar  to  those 
used  for  the  all-out  attacks.  This  sheet  listed  six  features  intended  as 
"physical",  "intermediate"  and  "functional"  representations  of  the  barrier 
length  and  solidness.  The  functional  features  are  "barrier  is  hard  to  go 
around"  and  "barrier  is  hard  to  go  through".  The  intermediate  features  are 
"barrier  length"  and  "barrier  solidness".  The  physical  features  are  "distance 
between end ships/subs" and "distance between ships/subs on either side of the
largest  internal  gap". 

Feature comparison sheets were used to assess the similarity between
features depicted in pairs of test pictures. Each page contained two pictures
of barriers, one from the original test set and one from the second test set.
Below the pictures was a list of the six features relevant to determining the
effectiveness of a barrier. Subjects were asked to rate how similar the first
barrier was to the second with respect to each of these features.

Procedure. The procedure for this experiment was the same as that
described for Experiment 1, with the following exceptions. Subjects in this
experiment saw only 10 training pictures and worked on the puzzles which served
as a distractor for only 10 minutes. After completing the feature ratings for
the test pictures, the subjects were asked to make effectiveness and confidence
ratings on seventeen additional pictures, and later were asked to complete
feature evaluation sheets on each of the new test pictures. Finally, they were
presented with pairs of barrier pictures and were asked to rate the similarity
of each of the features presented in the two barriers.

Subjects.  The  subjects  were  20  undergraduate  students  at  George  Mason 
University  in  Fairfax,  Virginia.  The  students  received  either  course  credit  or 
payment  for  their  participation  in  the  study. 

Results  and  Discussion 

Again in this experiment, all of the subjects reached criterion. The
mean number of trials required to reach criterion was 1.15. In addition, the
data from this experiment suggest that subjects abstracted from the examples the
model used to develop the pictures. The geometric mean of the features' scores,
weighted according to the weighting rule used to construct the pictures,
accounted for 88 percent of the variance in the overall effectiveness ratings
(r(8) = 0.937, p < .01). Figures 9 and 10 depict this relationship for each of
the two test sets. The subjects' assessments of barrier quality for test
pictures seen earlier in training and for the new test pictures not seen in
training approximated the model's assessments equally well. This result suggests
that subjects have internalized a barrier assessment process and are not just
remembering the pictures that they saw during the training period.
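The 88 percent figure follows from squaring the reported correlation. As a quick check, assuming the reported value is an ordinary Pearson correlation between the weighted geometric means and the effectiveness ratings:

```latex
r^2 = (0.937)^2 \approx 0.878 \approx 88\%
```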

The data on functional substitution indicate that newly formed schema
interact closely with other knowledge in memory. During the training for this
experiment, subjects never saw islands or peninsulae as part of the barrier.
The data from the ratings, however, suggest that subjects incorporated their
existing knowledge of the properties of land masses into their overall
effectiveness ratings.

Table 6 shows a schematic representation of each test picture, and the
effectiveness ratings given to that picture and to its physical and functional
equivalents. It can be seen from the table that in the island/peninsula group
the functionally equivalent pictures provided a better match to the initial
ratings than did the physically equivalent pictures, except for one picture. A
closer examination of this picture suggests that the functional equivalent
chosen for this picture was not appropriate because the new test item allows safe
passage through the internal gap, while such safe passage is not provided by the
proposed functional equivalent for this picture.

The "off-center" test pictures, where the ships were displaced to the
side, did not produce as clear results on this functional/physical equivalency.
For those pictures the barrier ratings do not match the ratings of the
functional equivalents better than do the ratings of the "look alikes".

A comparison of the subjects' responses for physical, intermediate, and
functional features suggests the point in the information processing sequence
at which subjects use the functional properties of land masses in their barrier
assessments. Table 7 shows that the weighted geometric means of the physical,
intermediate, and functional features predict the overall effectiveness ratings
about equally well. Because the physical features predicted overall assessments
as well as did the functional features, subjects must have taken the islands
and peninsulae into account when rating the distances between the two end
ships/subs or the two platforms on either side of the largest internal gap. This
result implies that the conversion of land masses to ship equivalents occurs
before the criteria curves are applied to the measurable feature properties.

TABLE 6. Comparison of ratings of off-center, island, and peninsula barriers
with their functional and "looks like" counterparts. Pictures 11
through 15 have islands and peninsulae added. In pictures 16 and 17
the barriers are displaced off-center.

TABLE 7. Physical, intermediate, and functional features as predictors of
subjects' assessments of barrier effectiveness. Average is average over
test pictures. Geometric mean weights are .75 for weaker feature and
.25 for stronger feature.

The actual feature criteria curves used by the subjects are shown in
Figure 11. The figure shows that the subjects' curves do not replicate the curves
used in the model. The curves for barrier length are generally too high, while
the curves for barrier solidness are generally too low. This lack of agreement
with the model is not surprising. During the training session, the subjects were
not told how much each feature contributes to the overall effectiveness rating.
Since there are many combinations of length and solidness ratings whose
geometric mean approximates the overall effectiveness ratings, it is not possible
for the subjects to infer the combination used by the model.
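The point that many feature-score combinations yield nearly the same overall rating can be illustrated with a minimal sketch. It assumes integer feature scores on a 1 to 10 scale and the .75/.25 weighting rule stated in the Table 7 caption; the function names, the target value, and the tolerance are hypothetical choices made only for illustration.

```python
from itertools import product


def barrier_score(length_score: float, solid_score: float) -> float:
    """Weighted geometric mean: .75 on the weaker feature, .25 on the stronger."""
    weak, strong = sorted((length_score, solid_score))
    return weak ** 0.75 * strong ** 0.25


def compatible_pairs(target: float, tol: float = 0.25):
    """All integer (length, solidness) score pairs whose score is within tol of target."""
    return [
        (length, solid)
        for length, solid in product(range(1, 11), repeat=2)
        if abs(barrier_score(length, solid) - target) <= tol
    ]


# Several quite different score pairs are compatible with one overall rating.
pairs = compatible_pairs(4.0)
print(pairs)
```

Because pairs such as (4, 4) and (3, 10) land close to the same overall score, an observer who sees only the overall rating cannot recover the individual feature scores.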

As in experiments 1 and 2, the weighted geometric means accounted for
most of the variance in the barrier effectiveness assessments. Here they
accounted for 93 percent of the variance (r(8) = 0.965, p < .01) in the target
ratings. Again, the weighted means more accurately predict the overall
effectiveness ratings than the unweighted means. As shown in Table 8, the
unweighted mean overestimates the effectiveness ratings by 0.44 for the standard
test pictures and 0.6 for the island/off-center test pictures. Using the
weighting procedure reduces the error to -0.09 for the standard test pictures
and to 0.12 for the island/off-center test pictures.

Figure 11. Feature Criteria Curves Relating Physical, Intermediate And Functional Feature
Assessments To Measurable Properties Of The Barrier Features.

TABLE 8. Average ratings compared with unweighted and weighted geometric means
of feature ratings.


Table 8 also shows that in the barrier pictures importance ratings cannot
be equated with the feature strength as measured by feature assessment scores.
Using importance as a weighting factor in this experiment led to a worse fit
between feature geometric mean and barrier assessment than using the unweighted
mean. This result suggests that, in general, importance rating is a combination
of the true feature weight and feature strength. In this case, the weighting
rule used to construct the training and test pictures produced the best fit.
This result shows that subjects inferred from the training that barrier quality
is primarily determined by its weaker component.


The analysis of feature similarity data failed to reveal any special
relationship between the ratings of features in test pictures and the similarity
of these features to features in the training pictures. As expected, features
rated similar received similar feature ratings. Those rated dissimilar,
however, could also receive similar ratings. This result presumably reflects
the fact that features rated equally strong can be strong in different ways.


General  Discussion 


The  data  in  these  experiments  resolved  most  of  the  issues  described 
under  "critical  model  issues".  This  discussion  reviews  the  data  support  for  the 
alternatives  described  for  each  of  these  critical  issues. 

Ease of learning, stability, and accuracy of schema abstracted from
examples. The data pertaining to these issues confirm the proposed schema model
of situation assessment. In each of the three experiments subjects acquired the
schema easily, as measured by trials to criterion during training. In each, they
retained a stable schema throughout the experimental session, as measured by the
consistency of situation assessments made at different times. In each, their
schema captured the model used to develop the training pictures, as measured by
the similarity of their assessments to the model's ratings. The inclusion of
training pictures in the barrier test set provided additional evidence that
subjects had based their assessments on a schema-like model rather than on
memory of specific instances. The subjects' estimates of situation quality for
the pictures not seen in training approximated the model's rating for those
pictures as closely as did their estimates for the pictures seen in training.

Assessment of feature relevance and functional substitution. The model
proposed that subjects, when shown pictures that contain familiar objects not in
the training pictures, would consider these objects in their situation
assessments. The model proposed that people would use general knowledge about
islands and peninsulae (ships cannot pass over land) and off-centeredness
(easier to go around) in evaluating the quality of barriers. The proposition
that subjects' schema for barriers would accommodate islands and peninsulae and
unusual barrier placement was confirmed. With one exception, people's
assessments of barrier quality were much closer to the barrier rating of the
functional equivalent than to that of the look-alike. It is easily argued,
however, that in the one case where their assessment was closer to the
look-alike, the barrier would not in fact function like the presumed functional
equivalent.

The data do not support the conjecture that people attempt to evaluate
the barriers using physical features, and use functional features only when the
physical features lead to an assessment of low barrier quality. This conjecture
would have been supported had people's assessments of barrier quality been
predicted from the functional features (hard to pass through, hard to go around)
but not from the physical features (distance between ships on either side of
largest internal gap, distance between end ships). Indeed, the data showed no
indication that functional features alone contribute to barrier assessment when
new objects are introduced into the barriers. Rather, physical and functional
features always were equally good predictors of overall assessments. The means
of physical and functional feature scores were extremely close for all pictures.

On the other hand, the data do support the existence of a very early
information processing step in which new objects not seen in training are
mentally replaced with a functional equivalent of objects seen in the training.
The fact that physical and functional feature scores were so close suggests that
the subjects, when answering questions about ship distances, were already taking
into account the effects of islands and peninsulae, perhaps by treating the
islands and peninsulae as additional ships. The early functional replacement of
"nonstandard" objects with standard ones is attractive, for it seems to increase
the general applicability of schema while minimizing the schema memory
requirements. If nonstandard units are converted to the units used by the
physical feature criteria curve, then the curve data can be stored more
economically than if a separate curve is required for every kind of object that
can contribute to barrier length or solidness.
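The idea of converting land masses to ship equivalents before any criteria curve is applied can be sketched as follows. This is only an illustration under strong assumptions that are not in the report: the barrier is reduced to positions along a single line, a land mass is represented as a (start, end) extent, and the function name and inputs are hypothetical.

```python
def largest_internal_gap(platforms, land_masses=()):
    """Return the widest open stretch inside a barrier laid out along a line.

    platforms: positions of ships/subs along the barrier line.
    land_masses: (start, end) extents of islands or peninsulae (hypothetical input).
    """
    # Functional replacement: a land mass blocks passage over its whole extent,
    # so it is treated like a row of touching ships from start to end.
    blocked = sorted([(p, p) for p in platforms] + list(land_masses))
    widest = 0.0
    reach = blocked[0][1]  # farthest blocked point seen so far
    for start, end in blocked[1:]:
        widest = max(widest, start - reach)
        reach = max(reach, end)
    return widest


ships_only = largest_internal_gap([0, 2, 8, 10])
with_island = largest_internal_gap([0, 2, 8, 10], land_masses=[(3, 7)])
print(ships_only, with_island)
```

With this conversion done first, a single gap-to-score curve serves both the ship-only barrier and the barrier with an island, which is the memory economy the paragraph above describes.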

Feature assessment and scoring: use of criteria curves. Feature scoring
is the conversion of measurable picture properties, such as the number of
aircraft in an all-out attack, to a subjective estimate of the contribution of
that feature to a strong all-out attack. The model proposes that feature
scoring is accomplished by evaluating features by means of the criteria curves
stored within the schema.

The  data  suggest  that  feature  scoring  is  an  important  step  in  situation 
assessment.  The  curves  themselves  are  simple  monotonic  functions  of  the 
measurable  feature  property,  and  the  feature  values  obtained  from  these  curves 
seem  to  be  used  in  the  overall  assessments. 

A comparison of the feature criteria curves attained in the first two
experiments, the low and high density all-out attacks, shows that the feature
criteria curves inferred by the subjects are derived from a combination of the
training materials and general knowledge not part of the training.

The all-out high density and all-out low density experiments differed
only in the numbers of platforms in the pictures. Every test and training
picture in the high set was identical to a corresponding picture in the low set,
except that the high set contained 50% more of each platform type. The words
used to describe the pictures were identical, and the geographic arrangements of
platforms were as similar as possible. In these two experiments the averages of
subjects' ratings of corresponding pictures were virtually identical, as were
the weighted geometric means of the feature scores. Because the number of
platforms differed, but the subjects' answers were similar on corresponding
experiment 1 and experiment 2 pictures, the schema formed in experiments 1 and 2
must be different. There are three places where this difference could occur: in
the criteria curves used for feature assessment and scoring, in the relative
weighting of the "many" and "multi-axis" features, and in the relative weighting
of the ships, subs, and aircraft overall threat features.

If the difference was solely in the feature criteria curves, then the
score for the features "many ships", "many submarines", and "many aircraft"
assigned to n platforms in the low all-out attack experiments would also be
assigned to 1.5 x n platforms in the high all-out attack experiments. Instead,
n platforms in the low set got the same score as a x n platforms in the high set,
with a = 1.22 for ships, 1.28 for air, and 1.34 for submarines. The difference
between these numbers may reflect a contribution from the usual notion of
"many"; nine ships fits the natural category "many ships" better than does six
ships.
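The comparison can be written compactly. The symbols below are not in the original report: s_low and s_high stand for the subjects' criteria curves in the low and high density experiments, and the ratio in the last line is only one rough, linear way to express how far each observed factor moves from 1 (no curve adjustment) toward 1.5 (full adjustment).

```latex
\begin{align*}
  s_{\text{low}}(n) &= s_{\text{high}}(a\,n),
    \qquad a = 1.5 \ \text{if only the criteria curves differed} \\
  a_{\text{ships}} &= 1.22, \quad a_{\text{air}} = 1.28, \quad a_{\text{subs}} = 1.34 \\
  \frac{a - 1}{1.5 - 1} &= 0.44,\ 0.56,\ 0.68
    \quad \text{(ships, air, submarines)}
\end{align*}
```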

These numbers indicate that while the feature criteria curves account
for much of the difference, feature combination and feature weighting are also
important. In these experiments, the initial feature combination rule,
combining "many" with "multi-axis" to yield "overall" for the ships, submarine,
and aircraft features, made up half of the difference. The rest was made up by
subjects weighting high-scoring features more heavily in the low all-out
attack cases than in the high all-out attack cases.

The feature criteria curves inferred by the subjects do not replicate
the curves in the model used to construct the training and test pictures,
particularly in the barrier pictures. Such replication is not expected, of
course, given the amount of information provided in the training about the
relative contribution of different factors. In the training the subjects were
told only the overall picture quality and the names of the features that are
weak or strong. Since they were never given any numerical information relating
particular feature characteristics to corresponding feature scores, and since
there are many combinations of feature scores and combination rules able to
produce each picture value, subjects did not have the information necessary to
infer the actual feature scaling curves used to develop the pictures.

Feature  weighting.  A  comparison  of  the  results  in  the  three  experiments 
shows  that  weighted  geometric  means  of  the  features  predict  attack  and  barrier 
assessments  better  than  do  the  unweighted  means,  that  subjects  use  simple 
schema-specific  rules  to  attain  the  weights,  and  that  the  feature  importance 
scores  reflect  the  feature's  strength  (feature  assessment  score)  as  well  as  its 
weight  in  the  geometric  mean. 

The rules for attaining weights in the all-out and barrier cases were
significantly different. For the all-out attacks, the weights were the feature
assessment scores. For the barriers the weights were .75 for the weaker feature
and .25 for the stronger one. The existence of simple rules for assigning
feature weights avoids a requirement for special criteria curves or complex
information processing methods for weight determination. Using such simple
rules conserves both memory and information processing resources.
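The two weighting rules can be put side by side in a short sketch. The function names and example scores are hypothetical, and normalizing the weights to sum to one is an assumption; the report states only that the all-out weights "were the feature assessment scores" and that the barrier weights were .75 and .25.

```python
import math


def weighted_geometric_mean(scores, weights):
    """Geometric mean of scores with weights normalized to sum to one."""
    total = sum(weights)
    return math.exp(sum(w * math.log(s) for s, w in zip(scores, weights)) / total)


def all_out_weights(scores):
    """All-out attack rule: each feature is weighted by its own assessment score."""
    return list(scores)


def barrier_weights(scores):
    """Barrier rule: .75 for the weaker of two features, .25 for the stronger."""
    first, second = scores
    return [0.75, 0.25] if first < second else [0.25, 0.75]


attack_scores = [9.0, 4.0, 2.0]   # illustrative ship, aircraft, submarine scores
barrier_scores = [8.0, 3.0]       # illustrative length and solidness scores

attack = weighted_geometric_mean(attack_scores, all_out_weights(attack_scores))
barrier = weighted_geometric_mean(barrier_scores, barrier_weights(barrier_scores))
print(round(attack, 2), round(barrier, 2))
```

Under these assumptions the all-out rule pulls the result toward the strongest feature, while the barrier rule pulls it toward the weakest, so the same combination formula yields opposite emphases in the two schema.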

When these experiments were designed it was thought that feature weights
would be closely related to subjects' ratings of feature importance. This
relationship was observed in the all-out attack; it was not observed in the
barrier experiments. In fact, what was observed was that "importance" was a
confounding of two factors: feature assessment score (how characteristic that
feature is of a strong attack or barrier) and feature weight. In the all-out
attack, these two factors correlated and the weight seemed to be derived from
the assessment score. In the barriers, the weighting rule that worked was the
one used to develop the training pictures. It rated a barrier's strength mostly
from its weaker component. For barriers, weighting features by assessment score
reduced the correspondence between the weighted geometric mean of features and
the assessments of barrier quality.

Feature combination. All three of the experiments tested the extent to
which the weighted geometric mean of the feature scores predicted subjects'
assessments of attack or barrier quality. In all three cases, the correlation
between the attack and barrier quality assessments and weighted feature
geometric means, averaged over subjects, exceeded 0.97. In addition, the
absolute difference between the weighted means and the quality assessments was
very small, averaging about .35 over all experiments.

Conclusions and Further Applications

The proposed schema and information processing model provide an excellent
explanation of the subjects' performance in these situation assessment tasks.
The subjects formed the schema from a sequence of examples described in terms
of features. Their assessments of the overall situation appeared to be derived
from their assessments of the situation features, and these seemed to be based
on objective measurable properties of the presented pictures. Further, the
schema so formed linked easily with concepts subjects had obtained prior to the
training. Subjects used these concepts in


While it is not likely that this specific model will be equally useful
for understanding every kind of situation assessment task, it is possible that
variants will prove useful for a broad range of such tasks. For example,
scripts, which have been shown useful for understanding social situations, are a
variant of the proposed model. The features of scripts are events and the time
relationship among events. Their feature criteria curves address the
characteristics of the script events and time arrangements.

It  is  also  possible  that  the  very  simple  schema  presented  here  will 
prove  to  be  important  building  blocks  of  more  elaborate  structures  requiring  a 
more  extensive  set  of  related  schema.  These  structures  may  have  several  levels 
of  schema  hierarchy,  and  may  include  schema  composed  of  more  abstract  features. 

Schema for situation assessment support "intuitive" decision making.
This kind of decision making is based on recognizing that an observed situation
is similar to other situations in which particular decisions or strategies
generally work well. "Intuitive" decision making requires data in memory that
support the necessary similarity assessment. The schema examined in these
experiments may play an important role in such assessments.

APPENDIX A


FEATURE  CRITERIA  CURVES  FOR  SHIPS 
OBJECTIVE  MEASURE  SUBJECTIVE  SCORE 

Number  of  Ships  "Many  Ships" 


4  -  5 


5  -  6 


To  attain  overall  strength  add  1  or  2  to  "many  ships"  if  multi-axis  is  two. 

FEATURE  CRITERIA  CURVES  FOR  AIRCRAFT 
OBJECTIVE  MEASURE  SUBJECTIVE  SCORE 

Number  of  Aircraft  "Many  Aircraft" 


6  -  9 


13  -  15 


To  attain  overall  strength,  subtract  1  or  2  from  "many  aircraft"  if  multi-axis 
is  one. 


FEATURE  CRITERIA  CURVES  FOR  SUBMARINES 
NUMBER  SURROUNDEDNESS 

OBJECTIVE MEASURE       SUBJECTIVE SCORE     OBJECTIVE MEASURE             SUBJECTIVE SCORE

Number of Submarines    "Many Submarines"    Number of Quadrants Covered   "Multi-axis"


Overall  strength  is  geometric  mean  of  "many  submarines"  and  "multi-axis." 


TABLE A-1. Construction of test and training pictures for all-out attacks:
criteria curve data used for feature scoring.

TABLE A-2. Design for Test Pictures for Experiment 1. The design for experiment two is
the same, except that each picture has 50% more ships, aircraft and submarines.


FEATURE CRITERIA CURVES FOR BARRIERS

OBJECTIVE    SUBJECTIVE    OBJECTIVE    SUBJECTIVE
LENGTH       f(L)          GAP*         g(G)

1"           1             6            1
2            2             5            2
3            3             4            4
4            4             3            5
5            5             2            6
6            7             1            8
7            9             0            10
8            10

*Add 1.5" to gap to find physical separation between platforms bordering
longest internal gap.

SCORE FOR BARRIER EFFECTIVENESS

$f(L)^{P} \, g(G)^{1-P}$

where P = .75 if f(L) < g(G)
      P = .25 otherwise


TABLE A-3. Construction of test and training pictures for barriers: criteria
curve data used for feature scoring.
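The scoring rule in Table A-3 can be expressed as a lookup followed by the weighted geometric mean. The sketch below is an illustration, not the procedure actually used to build the pictures: the dictionary and function names are hypothetical, the values are transcribed from the table as it survives in this copy, and the entry for a length of 6 is an assumption inferred from the surrounding sequence.

```python
# Criteria-curve values transcribed from Table A-3 (lengths and gaps in inches).
# The length-6 entry is an assumption: that cell is not legible in this copy.
LENGTH_SCORE = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 7, 7: 9, 8: 10}
GAP_SCORE = {6: 1, 5: 2, 4: 4, 3: 5, 2: 6, 1: 8, 0: 10}


def barrier_effectiveness(length: int, gap: int) -> float:
    """Score a barrier as f(L)^P * g(G)^(1-P), with P favoring the weaker feature."""
    f = LENGTH_SCORE[length]   # subjective score for barrier length
    g = GAP_SCORE[gap]         # subjective score for the largest internal gap
    p = 0.75 if f < g else 0.25
    return f ** p * g ** (1 - p)


# A long barrier with a wide gap scores as poorly as a short, solid one.
long_but_leaky = barrier_effectiveness(8, 6)
short_but_solid = barrier_effectiveness(1, 0)
print(round(long_but_leaky, 2), round(short_but_solid, 2))
```

Both example barriers receive the same low score because, in each, one feature scores 1 and carries three quarters of the weight.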
TABLE A-4a. Design for test and training pictures in experiment 3.


ISLANDS AND OFF-CENTERS

TABLE A-4b. Design of test and training pictures for experiment 3.

Islands, Barriers, and Off-Centers.